Progress during Year 2 of "Determination of Multiple Sound Sources"


AFOSR Grant F49620-92-J-0489, P.I. William A. Yost, 9/93 to 9/94.

Three major projects completed during the funding period of September 1993 to September 1994
are described below. This progress report ends with a list of publications for the same period.

The  Cocktail  Party  Problem:  Forty  Years  Later 

William  A.  Yost 


Forty years ago, Colin Cherry (J. Acoust. Soc. Am. 25, 975-981, 1953) described the "cocktail
party problem," and he suggested that spatial hearing was a major method used by the auditory
system to separate sound sources in a multisource acoustic environment. A thorough review of the
past 40 years of spatial hearing studies that involve more than one sound source or potential
sound source was completed in an attempt to determine the role spatial hearing plays in sound
source segregation. Almost all of the published data involve only two sound sources, and the
results indicate that spatial hearing may not be the major cue used for sound source segregation.
However, there are very few studies that have investigated the cocktail party problem in
real-world listening conditions, especially when there are more than two sound sources.


Divided  Auditory  Attention  With  Up  To  Three  Sound  Sources:  A  Cocktail 
Party 

William  A.  Yost,  Stanley  Sheft,  and  Raymond  (Toby)  Dye 

In 1953 Cherry (J. Acoust. Soc. Am. 25, 975-981, 1953) wrote, "how do we recognize what one
person is saying when others are speaking at the same time (the 'cocktail party problem')?" Cherry
assumed that binaural hearing was a major solution for the cocktail party problem. There have
been few studies (especially in real-world listening environments) that directly measure the role of
binaural processing in the divided attention tasks that characterize a cocktail party. In this study
listeners were asked to identify words, letters, or numbers (word identification) that were
presented over loudspeakers, and to identify the loudspeaker (word location) that presented
each utterance. Performance was measured in three conditions: 1) natural listening, in which the
words and letters were presented over loudspeakers in a sound-deadened room in which the listener
was seated; 2) single microphone/headphone listening, in which a single microphone located at the
position of the listener sent the sounds to a single headphone worn by the listener, who was seated
in a remote sound proof room; and 3) KEMAR listening, in which KEMAR was "seated" where the
listener would have been and the binaural outputs from KEMAR were fed to stereo headphones
worn by the listener, who was seated in the remotely located sound proof room.


METHODS 


There were seven loudspeakers in the room, equally spaced around the frontal hemisphere 1.3 m
from the listener at a height of 1.2 m (approximately the height of the head of the listener when
seated). The room is 3.5 m long by 2.5 m wide by 2.1 m high, constructed out of sound-deadened
office partitions and lined with sound attenuating foam on all surfaces. Additional sound
attenuating foam was placed in the room at various locations (e.g. directly behind the listener) in
order to equalize the reflections measured at the location of the listener for each of the seven
loudspeakers. The amplitude of the first reflection ranged from -17 to -21 dB re: the source
across the seven loudspeakers. The loudspeaker outputs were matched in dBA output, and the
seven loudspeakers were chosen to be within 2 dB of each other in the spectral region of 100 to
7000 Hz (the approximate bandwidth of the stimuli). No other attempts were made to equalize
the loudspeakers, since we wanted to create a somewhat real-world listening environment.

The speech materials were 42 NU-6 words, the 26 letters of the alphabet, and the numbers 1 to 9,
spoken by seven male talkers. Three judges listened to the words and letters to determine their
intelligibility. Additional recordings of the words, letters, or numbers were made, or replacement
words were chosen, until all three judges felt that the utterances from all seven talkers were
equally intelligible.

During each test condition each loudspeaker presented the utterances of a unique male talker (e.g.
loudspeaker 1 would always present the utterances of male talker 2, etc.). For each of the
listening environments, there were three listening conditions: 1) one at a time, in which a single
word, letter, or number was presented from a loudspeaker chosen randomly from trial to trial; 2)
two at a time, in which two words, letters, or numbers were presented simultaneously, one from
each of two loudspeakers that were chosen randomly; and 3) three at a time, in which three words,
letters, or numbers were presented simultaneously, one from each of three loudspeakers that were
chosen randomly. In each of these conditions there were three lists: 1) 42 NU-6 words (6 words
for each of the seven talkers) presented the first time (W1y); 2) the same NU-6 words
presented again but in a different random order (this repetition allowed an estimate of learning)
(W2y); and 3) the 26 letters and 9 numbers (let). The first time through each list
the listener was asked to enter on the computer keyboard all of the words they heard. They
could listen to the words, letters, or numbers as many times as they wanted, and the number of
times they listened was recorded. After they finished indicating the words, letters, or numbers they
heard, the utterances were repeated in exactly the same order and they were to indicate from
which loudspeaker (1-7) they heard each utterance (the actual word, letter, or number they
responded with during the word identification part of the experiment was shown). Five listeners
participated in each base condition, where a base condition was listening in one of the three
environments in the 1-, 2-, or 3-at-a-time condition (45 total listeners).

All utterances were lowpass filtered at 7,000 Hz (they were played out at a 16,000-Hz rate), were
normalized to the same rms level, and when more than one word at a time was presented the
utterances were temporally aligned so that the temporal midpoints of all utterances coincided.
Since the difference in duration of the utterances was maximally 128 ms, the maximum onset
(offset) separation was 64 ms. All utterances were presented at 70 dBA and mixed with a
continuous broadband noise presented at 65 dBA.
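
The 64-ms figure follows from aligning utterances at their temporal midpoints: a shorter utterance starts later than the longest one by half the difference in duration. A minimal Python sketch of that arithmetic is given below. The helper name `onset_delays_ms` and the three durations are hypothetical; only the 128-ms maximum duration difference comes from the report.

```python
def onset_delays_ms(durations_ms):
    """Return the onset delay (ms) of each utterance so that all temporal
    midpoints coincide with the midpoint of the longest utterance."""
    longest = max(durations_ms)
    # A shorter utterance starts later by half the duration difference.
    return [(longest - d) / 2.0 for d in durations_ms]


# Hypothetical durations; the report states only that they differ by at most 128 ms.
durations = [400.0, 464.0, 528.0]
delays = onset_delays_ms(durations)
print(delays)  # [64.0, 32.0, 0.0]
```

Because the alignment is symmetric about the midpoint, the same 64-ms bound applies to offsets.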

The data for the letters and numbers were scored once for total correct as responded by the
listeners. For the words there were three levels of analysis: 1) Level 1 (Wx1): scored directly as
the listener recorded their answers; 2) Level 2 (Wx2): corrections were made for spelling and
homonyms (e.g. dear for deer); and 3) Level 3 (Wx3): additional corrections were made for words or
near words that might have been the combination of the words presented (e.g. mop and fall
yielding mall), for close homonyms (e.g. drain for rain), and for suffixes added by the listener that
were not in the spoken word (e.g. homes for home).


RESULTS-DISCUSSION 


Performance decreases from the 1-at-a-time to the 2-at-a-time to the 3-at-a-time listening
conditions. Performance is best in the normal listening environment, worst in the one-mic and
one-phone environment, and intermediate for the KEMAR environment. These differences are
greater for the word identification tasks (Wxy) than for the letter and number identification tasks
(let). Localization performance is very good in the normal environment, fairly accurate with
KEMAR, and at or barely above chance (chance is 1/7 or 14.3%) in the one-mic and one-phone
environment. Some listeners did perform above chance in that they recognized some of the voices
and assigned them a consistent speaker number, and some of the time this assignment was
correct. Such a strategy also meant that some listeners scored below chance.

In the normal listening condition localization performance is best for the loudspeakers directly in
front of the listener (4, 5, and 6), and performance is lower for the side loudspeakers (1, 2, 6, and
7). Because the computer terminal and keyboard were directly in front of the listener, they were
most likely facing toward loudspeaker 4 during the experiment (KEMAR also faced loudspeaker
4). The brief duration of each sound would have made it unlikely that significant head movements
would have occurred. Thus, the data are consistent with poor localization acuity for sounds off to
one side.

For the normal and KEMAR listening conditions performance improves the farther apart the
sounds from the two or three loudspeakers become. There is a small change in performance as a
function of loudspeaker separation in the one-mic and one-phone conditions.

In the KEMAR conditions listeners reported frustration at not being able to turn toward the
perceived location of an utterance. Many listeners believed this inability hindered their
performance.

CONCLUSIONS 

Binaural hearing does seem to play a role in the divided attention tasks that characterize the cocktail
party problem. This appears to be especially true when the listener does not have a great deal of
familiarity with the messages and when three words are presented simultaneously. Coupling a
listener's head movements to the position of KEMAR relative to the sound sources might improve
performance when listening in the KEMAR environment. Thus in answer to Cherry's question,
"On what logical basis could one design a machine ('filter') for carrying out such an operation
(solving the cocktail party problem)?", spatial hearing does provide one such logical basis.

Analytic  and  Synthetic  Listening 
R.H.  Dye,  William  A.  Yost,  and  Stanley  Sheft 
I.  INTRODUCTION 

A major role for hearing is to determine the sources of sounds. Since the sounds from many
sources are combined into one complex sound field as the input to the auditory system, the
auditory system must determine what aspects of this sound field are unique to each sound source.
Analytic listening indicates that a listener is able to segregate the information in a sound field
according to sources, while synthetic listening represents a failure to segregate. A new procedure,
the Synthetic/Analytic Listening Task (SALT), is used to measure analytic and synthetic listening
in two tasks: a lateralization task in which interaural time differences are the basis for sound
source segregation, and a modulation discrimination task in which amplitude modulation is the
basis for sound source segregation. The SALT procedure is more useful in describing
performance related to sound source determination than are the more traditional procedures based
on measures of threshold.

II. LATERALIZATION

A.  THE  SYNTHETIC/ANALYTIC  LISTENING  TASK. 

For the lateralization work each trial consisted of two intervals, with the first providing a
diotic presentation of a 753-Hz cue tone that served to indicate the intracranial midline and the
target frequency. The second interval presented the test signal, which was a two-tone complex
composed of the 753-Hz target and a 553-Hz distractor. Data were gathered in blocks of 100
trials, with the target and distractor each presented at ten different interaural delays that ranged
from left-leading to right-leading and were symmetrically placed about zero. Each possible pairing
of target and distractor delay was presented once, in a random order, during each block of trials.
Subjects were instructed to indicate whether the target component appeared to the left or right of
the intracranial midline as marked by the cue tone presented during the first interval. Feedback
concerning the position of the target was provided to listeners on a trial-by-trial basis.

The durations of the signals were 25, 50, 100, 200, or 400 ms, with the target and
distractor gated simultaneously with 10-ms rise-decay times. Nine listeners were run under
conditions in which the interaural delays of the target and distractor ranged from -90 μs to +90 μs
in 20-μs steps, with negative delays indicating left-leading signals and positive delays indicating
right-leading signals. The level of each component was 53 dB SPL. Within a block of 100 trials,
the duration was fixed. Matrices of left-right judgments were generated for each block of trials,
with target delay plotted on the abscissa and distractor delay plotted on the ordinate.
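
The block structure described above (ten target delays crossed with ten distractor delays, each pairing presented once in random order) can be sketched in a few lines of Python. The function name `make_block` and the random seed are assumptions made for illustration; the delay values are those stated in the text.

```python
import itertools
import random

# Interaural delays in microseconds: -90 to +90 in 20-us steps (ten values).
DELAYS_US = list(range(-90, 91, 20))


def make_block(rng):
    """Return one block of 100 (target_delay, distractor_delay) trials,
    each pairing presented once, in random order."""
    trials = list(itertools.product(DELAYS_US, DELAYS_US))
    rng.shuffle(trials)
    return trials


block = make_block(random.Random(0))  # the seed is arbitrary
print(len(block), block[:3])
```

Tallying the left and right responses for each (target, distractor) cell over blocks would yield the response matrix described above, with target delay on the abscissa and distractor delay on the ordinate.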

B.  ANALYSIS  OF  LEFT-RIGHT  RESPONSES. 

Extracted from each composite matrix of responses are the slope and x-intercept of the
best-fitting linear boundary between left and right responses. Assume that the interaural delay of
the target and the interaural delay of the distractor are related by separate functions to the
percepts arising from each (perceived laterality, in this case),

$X_i = f(\mathrm{IDT}_{T_i}), \quad Y_j = f(\mathrm{IDT}_{D_j})$   (1)

where $X_i$ is the percept associated with the ith interaural difference of time of the target and $Y_j$ is
the percept associated with the jth interaural difference of time of the distractor. Assume that the
decision variable used by listeners is a weighted combination of the percepts arising from the
target and the distractor dimensions, with $w_T$ and $w_D$ representing the weights given to the target
and the distractor perceptual dimensions, respectively. Left-leading signals produce negative
values of the percept and right-leading signals produce positive values. Listeners respond "Right"
if

$(w_T X_i + w_D Y_j) > C$ and "Left" if $(w_T X_i + w_D Y_j) < C$,   (2)

where C is the decision criterion used for making left and right responses on the basis of the
decision variable. Setting the decision variable equal to C gives the boundary between the two
response regions,

$Y_j = (-w_T / w_D) X_i + C / w_D$   (3)

and the slope of the linear boundary between left and right responses is the ratio of the weights
given to the two perceptual dimensions ($m = -w_T / w_D$). Often it will be convenient to normalize the
weights so that $w_T + w_D = 1.0$, so

$w_T = m / (m - 1)$ and $w_D = 1 / (1 - m)$.   (4)

The y-intercept, multiplied by $w_D$, provides an estimate of the decision criterion, C. Analytic
performance will be associated with $w_T$'s near 1.0. Synthetic performance will be associated with
$w_T$'s near 0.5, reflecting equal weighting of the target and distractor.
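
As a worked illustration of the normalization in equation (4), the short Python sketch below converts a boundary slope into normalized weights and recovers the criterion from the y-intercept. The function names and the two example slopes are hypothetical and are not data from the report.

```python
def weights_from_slope(m):
    """Normalized weights (w_T + w_D = 1) from the boundary slope m = -w_T / w_D."""
    if m == 1:
        raise ValueError("a slope of 1 is incompatible with w_T + w_D = 1")
    w_t = m / (m - 1.0)
    w_d = 1.0 / (1.0 - m)
    return w_t, w_d


def criterion_from_intercept(y_intercept, w_d):
    """Decision criterion C from the y-intercept (C / w_D) of the boundary."""
    return y_intercept * w_d


# Illustrative slopes only (not results from the report).
print(weights_from_slope(-1.0))  # equal weights: synthetic listening
print(weights_from_slope(-9.0))  # target weighted heavily: near-analytic listening
```

A slope of -1 gives equal weights of 0.5, while steeper negative slopes push the target weight toward 1.0; a vertical boundary (distractor weight of zero) has no finite slope and would need to be handled separately.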

Without making assumptions about the shapes of the perceptual distributions that result
from presentations of left-leading and right-leading targets (distributed uniformly and discretely in
stimulus space), little can be said about the shape of the boundary between left and right
responses. However, multivariate normal distributions provide good models for many naturally
occurring categories. As long as the covariances associated with the "left" and "right" perceptual
distributions are the same, the optimal boundary between them will be linear. As such, the
assumption that the decision bound is linear appears to be a reasonable one.
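
The linearity claim can be made explicit with a standard result for two multivariate normal distributions that share a covariance matrix. The symbols used here (means $\boldsymbol{\mu}_L$ and $\boldsymbol{\mu}_R$, common covariance $\Sigma$, prior probabilities $P_L$ and $P_R$, and percept vector $\mathbf{x} = (X, Y)$) are notation introduced for illustration and do not appear in the report:

```latex
\begin{aligned}
\ln\frac{p(\mathbf{x}\mid R)\,P_R}{p(\mathbf{x}\mid L)\,P_L}
 &= -\tfrac12(\mathbf{x}-\boldsymbol{\mu}_R)^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}_R)
    +\tfrac12(\mathbf{x}-\boldsymbol{\mu}_L)^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}_L)
    +\ln\frac{P_R}{P_L} \\
 &= (\boldsymbol{\mu}_R-\boldsymbol{\mu}_L)^{\top}\Sigma^{-1}\mathbf{x}
    -\tfrac12\left(\boldsymbol{\mu}_R^{\top}\Sigma^{-1}\boldsymbol{\mu}_R
    -\boldsymbol{\mu}_L^{\top}\Sigma^{-1}\boldsymbol{\mu}_L\right)
    +\ln\frac{P_R}{P_L}
\end{aligned}
```

The quadratic terms in $\mathbf{x}$ cancel because $\Sigma$ is shared, so setting the expression to zero yields a boundary that is linear in $X$ and $Y$. With unequal covariances the quadratic terms would remain and the optimal boundary would be curved.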

Best linear boundaries were found with an algorithm that minimizes the sum of the
Euclidean distances between the boundary and misclassified responses ("left" responses to the
right of the boundary, "right" responses to the left of the boundary). Boundaries associated with
target weights ranging from 0.0 to 1.2 in steps of 0.01 were assessed at values of C ranging from
-90 to +90 μs. Although this particular algorithm does not seek to maximize classification
accuracy, it generally yielded boundaries that fell within the range of those that produced the
highest percentage of correct classifications.
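
The search described above can be sketched as a simple grid search. The Python below is an illustrative reconstruction, not the authors' code, and it rests on several assumptions that the report does not state: the percept is taken to equal the interaural delay in μs (the function f is treated as the identity), the weights are normalized so that the distractor weight is one minus the target weight, the step size for C is a parameter (defaulting to 1 μs), and ties are resolved by keeping the first boundary found. The name `fit_boundary` and the simulated listener are hypothetical.

```python
import math


def fit_boundary(cells, c_min=-90.0, c_max=90.0, c_step=1.0):
    """Grid-search the linear boundary w_T*x + w_D*y = C (with w_D = 1 - w_T).

    cells: list of (x, y, n_right, n_left), where x and y are the target and
    distractor delays (us) and n_right / n_left are response counts.
    Returns (w_T, C, loss), where loss is the summed Euclidean distance
    from the boundary to every misclassified response.
    """
    n_c = int(round((c_max - c_min) / c_step)) + 1
    best = None
    for k in range(0, 121):              # w_T = 0.00, 0.01, ..., 1.20
        w_t = k / 100.0
        w_d = 1.0 - w_t
        norm = math.hypot(w_t, w_d)      # turns |decision variable - C| into a distance
        for j in range(n_c):
            c = c_min + j * c_step
            loss = 0.0
            for x, y, n_right, n_left in cells:
                signed = (w_t * x + w_d * y - c) / norm
                if signed < 0:           # cell lies on the "left" side of the boundary
                    loss += n_right * -signed
                elif signed > 0:         # cell lies on the "right" side of the boundary
                    loss += n_left * signed
            if best is None or loss < best[2]:
                best = (w_t, c, loss)
    return best


delays = range(-90, 91, 20)
# Hypothetical ideal analytic listener: "right" whenever the target leads right.
analytic = [(x, y, int(x > 0), int(x < 0)) for x in delays for y in delays]
w_t, c, loss = fit_boundary(analytic, c_step=10.0)
print(w_t, c, loss)
```

For this simulated listener a range of target weights near 1.0 classifies every response correctly, which illustrates why the fitted boundary is best read as one member of a family of near-equivalent solutions.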

C. SUBJECTS AND TRAINING.

All nine listeners had extensive prior experience in lateralization experiments. Prior to data
collection, all listeners received at least 15-18 hours of training during which they lateralized
753-Hz tones in isolation as well as in the presence of a 553-Hz distractor.

III. AMPLITUDE MODULATION

A. STIMULI

For the modulation work, two stimuli were presented per trial, the cue and the test
stimulus. The cue consists of the amplitude-modulated target and is defined as:

$S_C(t) = (1 + m_t \sin(2\pi f_{mt} t)) \cos(2\pi f_t t)$   (5)

with $f_{mt}$ = target modulation frequency, $f_t$ = target carrier frequency, and $m_t$ = target depth of
modulation.

The test stimulus consists of the amplitude-modulated target and an amplitude-modulated
distractor and is defined as:

$S_T(t) = (1 + m_t \sin(2\pi f_{mt} t)) \cos(2\pi f_t t) + (1 + m_d \sin(2\pi f_{md} t)) \cos(2\pi f_d t)$   (6)

with $f_{md}$ = distractor modulation frequency, $f_d$ = distractor carrier frequency, and $m_d$ = distractor
depth of modulation.

The target depth of modulation ($m_t$) in the cue stimulus was always -15 dB in terms of
20 log $m_t$, but in the test stimulus it and the distractor depth of modulation (20 log $m_d$) could take on
one of ten values (-22.5, -21.0, -19.5, -18.0, -16.5, -13.5, -12.0, -10.5, -9.0, and -7.5 dB; five
depths lower than -15 dB and five greater than -15 dB). The carrier frequencies ($f_t$ and $f_d$) were
either 1000 or 4000 Hz; the modulation frequencies of the distractor ($f_{md}$) were 4, 8, 16, 32,
and 64 Hz, while the modulation rate of the target ($f_{mt}$) was always 16 Hz. All stimuli were 500
ms in duration. The overall level of the cue and test stimuli was 75 dB SPL and there was a
continuous 20-dB spectrum level broadband noise (filtered only by the headphones; a similar
noise has been used in studies of MDI to make it difficult for listeners to process modulation in a
single frequency band).
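
To make the stimulus definitions concrete, the following Python sketch synthesizes a cue and a test stimulus from the two-component form given above. Several details are assumptions not stated in the report: the 16,000-Hz sampling rate, the pairing of a 1000-Hz target carrier with a 4000-Hz distractor carrier, and the helper name `am_tone`. Level calibration (75 dB SPL), gating, and the background noise are omitted.

```python
import math


def am_tone(depth_db, f_mod, f_carrier, dur_s=0.5, fs=16000):
    """Samples of (1 + m*sin(2*pi*f_mod*t)) * cos(2*pi*f_carrier*t),
    with the depth of modulation given as 20*log10(m) in dB."""
    m = 10.0 ** (depth_db / 20.0)
    n = int(round(dur_s * fs))
    return [
        (1.0 + m * math.sin(2 * math.pi * f_mod * i / fs))
        * math.cos(2 * math.pi * f_carrier * i / fs)
        for i in range(n)
    ]


# Cue: target alone at the -15 dB reference depth (16-Hz modulation, 1000-Hz carrier).
cue = am_tone(-15.0, 16.0, 1000.0)

# Test: target at -22.5 dB plus a distractor at -7.5 dB (32-Hz modulation, 4000-Hz carrier).
target = am_tone(-22.5, 16.0, 1000.0)
distractor = am_tone(-7.5, 32.0, 4000.0)
test_stimulus = [a + b for a, b in zip(target, distractor)]
print(len(cue), len(test_stimulus))
```

A depth of -15 dB corresponds to m of about 0.18, so the ten test depths span roughly m = 0.07 to m = 0.42.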

The listeners' task was to decide if the depth of modulation of the target carrier in the test
stimulus was greater or lower than that of the target in the cue stimulus, noting that the
distractor carrier in the test stimulus was modulated with one of the ten depths of modulation.
Feedback was provided consistent with the modulation depth of the target in the test relative to
that in the cue. For instance, the target carrier in the test stimulus may have been modulated at a
-22.5 dB depth of modulation and the distractor carrier modulated at a -7.5 dB depth of
modulation. In this case the listener should respond "lower" since the target depth of modulation
is lower in the test (-22.5 dB) than it was in the cue (-15 dB). To respond in this way, the listener
would have to be able to ignore the fact that the distractor carrier in the test was modulated with
a greater depth of modulation (-7.5 dB) than the target in the cue. That is, the listener had to
perform analytically in attending to the target carrier and ignoring the distractor carrier. Figure 6
is a schematic description of the stimulus condition and procedure.

B. LISTENERS.

Four listeners with normal hearing were tested in a four-person, sound proof room.

There were 100 trials per block, such that each of the 100 combinations of 10 depths
of target modulation and 10 depths of distractor modulation was presented once. Eight
blocks of trials (800 trials) were used to estimate the target and distractor weights.

C. ESTIMATING THE WEIGHTS

In applying the SALT procedure to this stimulus situation, we followed the method
described above. First, a linear boundary separating the responses is described by equations (2)
and (3) as:

Listeners respond "Higher" if $(w_T X_i + w_D Y_j) > C$ and "Lower" if $(w_T X_i + w_D Y_j) < C$,

where C is the decision criterion used for making higher and lower responses on the basis of the
decision variable (higher means that the target was modulated with a greater depth in the test than
in the cue stimulus, and lower means the opposite). The slope of the linear boundary between
lower and higher responses is the ratio of the weights given to the two perceptual dimensions
($m = -w_T / w_D$).

For each condition and listener the 800 trials in the ten-by-ten response matrix were used
to determine the linear boundary that best partitions the "high (H)" and "low (L)" responses. As
explained above, the linear boundary is determined by an algorithm that minimizes the sum of the
Euclidean distances between the linear boundary and misclassified responses (i.e., an "L" response
to the right of the boundary, an "H" response to the left of the boundary).

IV. RESULTS AND CONCLUSIONS

The results for both studies showed that the SALT procedure produced weights that were
much more consistent with subject reports than threshold measures. That is, the change in weights
with increasing stimulus duration for the lateralization task indicated that the tones become more
separable (analytic listening) as the overall duration increases. This does not occur with traditional
measures of thresholds. In the amplitude modulation task, the modulated tones become more
separable (analytic listening) as the modulation rates of the target and distractor differ. Again,
traditional measures using thresholds do not show the same trend. Subjects also report that the
modulated tones are different when the rates differ.

In attempting to describe how listeners process the many sound sources that make up
most of our everyday listening environments, we often need to determine if they are analytic or
synthetic in processing the spectral and temporal acoustic information. The SALT procedure
offers a new method for determining, on a listener-by-listener basis, the degree to which a set of
acoustic parameters is being processed analytically or synthetically. By being able to describe
listeners' performance in this way, we hope to be able to more completely describe the process of
hearing.